Generalized Zurek's bound on the cost of an individual classical or quantum computation
We consider the minimal thermodynamic cost of an individual computation, where a
single input $x$ is mapped into a single output $y$. In prior work, Zurek proposed
that this cost was given by $K(x|y)$, the conditional Kolmogorov complexity of $x$
given $y$ (up to an additive constant which does not depend on $x$ or $y$). However,
this result was derived from an informal argument, applied only to deterministic
computations, and had an arbitrary dependence on the choice of protocol (via the
additive constant). Here we use stochastic thermodynamics to derive a generalized
version of Zurek's bound from a rigorous Hamiltonian formulation. Our bound applies
to all quantum and classical processes, whether noisy or deterministic, and it
explicitly captures the dependence on the protocol. We show that $K(x|y)$ is a
minimal cost of mapping $x$ to $y$ that must be paid using some combination of heat,
noise, and protocol complexity, implying a tradeoff between these three resources.
Our result is a kind of "algorithmic fluctuation theorem" with implications for
the relationship between the Second Law and the Physical Church-Turing thesis.
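Schematically, the three-way tradeoff described above can be read as a single inequality of the form

\[
  \frac{Q(x \to y)}{k_B T \ln 2} \;+\; \bigl(-\log_2 p(y \mid x)\bigr) \;+\; K(\text{protocol}) \;\gtrsim\; K(x \mid y),
\]

where $Q(x \to y)$ is the heat generated, $-\log_2 p(y \mid x)$ quantifies the noise of the map, and $K(\text{protocol})$ is the algorithmic complexity of the driving protocol. This display is only an illustrative reading of "heat, noise, and protocol complexity"; the precise statement, constants, and protocol dependence are those derived in the paper.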
Dependence of dissipation on the initial distribution over states
We analyze how the amount of work dissipated by a fixed nonequilibrium
process depends on the initial distribution over states. Specifically, we
compare the amount of dissipation when the process is used with some specified
initial distribution to the minimal amount of dissipation possible for any
initial distribution. We show that the difference between those two amounts of
dissipation is given by a simple information-theoretic function that depends
only on the initial and final state distributions. Crucially, this difference
is independent of the details of the process relating those distributions. We
then consider how dissipation depends on the initial distribution for a
'computer', i.e., a nonequilibrium process whose dynamics over coarse-grained
macrostates implement some desired input-output map. We show that our results
still apply when stated in terms of distributions over the computer's
coarse-grained macrostates. This can be viewed as a novel thermodynamic cost of
computation, reflecting changes in the distribution over inputs rather than the
logical dynamics of the computation.
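In schematic form, writing $\Sigma(p_0)$ for the dissipation incurred when the process is run on initial distribution $p_0$, $q_0$ for the dissipation-minimizing initial distribution, and $p_\tau$, $q_\tau$ for the corresponding final distributions, the result described above can be read as

\[
  \Sigma(p_0) \;-\; \Sigma(q_0) \;=\; D\!\left(p_0 \,\|\, q_0\right) \;-\; D\!\left(p_\tau \,\|\, q_\tau\right),
\]

with $D(\cdot\|\cdot)$ the Kullback-Leibler divergence. The notation and units here are illustrative, chosen to match the verbal statement of the abstract rather than quoted from the paper.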
Estimating Mixture Entropy with Pairwise Distances
Mixture distributions arise in many parametric and non-parametric settings --
for example, in Gaussian mixture models and in non-parametric estimation. It is
often necessary to compute the entropy of a mixture, but, in most cases, this
quantity has no closed-form expression, making some form of approximation
necessary. We propose a family of estimators based on a pairwise distance
function between mixture components, and show that this estimator class has
many attractive properties. For many distributions of interest, the proposed
estimators are efficient to compute, differentiable in the mixture parameters,
and become exact when the mixture components are clustered. We prove this
family includes lower and upper bounds on the mixture entropy. The Chernoff
$\alpha$-divergence gives a lower bound when chosen as the distance function,
with the Bhattacharyya distance providing the tightest lower bound for
components that are symmetric and members of a location family. The
Kullback-Leibler divergence gives an upper bound when used as the distance
function. We provide closed-form expressions of these bounds for mixtures of
Gaussians, and discuss their applications to the estimation of mutual
information. We then demonstrate that our bounds are significantly tighter than
well-known existing bounds using numeric simulations. This estimator class is
very useful in optimization problems involving maximization/minimization of
entropy and mutual information, such as MaxEnt and rate distortion problems.
Comment: Corrects several errata in published version, in particular in
Section V (bounds on mutual information).
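As a concrete illustration, here is a minimal numpy sketch for a Gaussian mixture, assuming the estimator family takes the pairwise-distance form $\hat{H}_D = \sum_i w_i H(p_i) - \sum_i w_i \ln \sum_j w_j e^{-D(p_i, p_j)}$, with the KL divergence yielding an upper bound and the Bhattacharyya distance a lower bound; the function names are illustrative, not from an existing library.

```python
import numpy as np

def gaussian_entropy(cov):
    """Differential entropy (in nats) of a Gaussian with covariance `cov`."""
    d = cov.shape[0]
    return 0.5 * (d * np.log(2 * np.pi * np.e) + np.linalg.slogdet(cov)[1])

def kl_gaussian(mu_i, cov_i, mu_j, cov_j):
    """KL divergence D(N_i || N_j) between two multivariate Gaussians."""
    d = len(mu_i)
    inv_j = np.linalg.inv(cov_j)
    diff = mu_j - mu_i
    return 0.5 * (np.trace(inv_j @ cov_i) + diff @ inv_j @ diff - d
                  + np.linalg.slogdet(cov_j)[1] - np.linalg.slogdet(cov_i)[1])

def bhattacharyya_gaussian(mu_i, cov_i, mu_j, cov_j):
    """Bhattacharyya distance between two multivariate Gaussians."""
    cov = 0.5 * (cov_i + cov_j)
    diff = mu_i - mu_j
    return (0.125 * diff @ np.linalg.inv(cov) @ diff
            + 0.5 * np.linalg.slogdet(cov)[1]
            - 0.25 * (np.linalg.slogdet(cov_i)[1] + np.linalg.slogdet(cov_j)[1]))

def pairwise_entropy_estimate(weights, mus, covs, dist):
    """Estimator: sum_i w_i H(p_i) - sum_i w_i ln( sum_j w_j exp(-dist(p_i, p_j)) )."""
    est = sum(w * gaussian_entropy(c) for w, c in zip(weights, covs))
    for i, w_i in enumerate(weights):
        inner = sum(w_j * np.exp(-dist(mus[i], covs[i], mus[j], covs[j]))
                    for j, w_j in enumerate(weights))
        est -= w_i * np.log(inner)
    return est

# Two-component Gaussian mixture in 2D.
weights = [0.5, 0.5]
mus = [np.zeros(2), np.array([3.0, 0.0])]
covs = [np.eye(2), np.eye(2)]
upper = pairwise_entropy_estimate(weights, mus, covs, kl_gaussian)             # upper bound
lower = pairwise_entropy_estimate(weights, mus, covs, bhattacharyya_gaussian)  # lower bound
print(lower, upper)
```

With either distance the estimate reduces to the component entropy when all components coincide and to $\sum_i w_i H(p_i) + H(w)$ when the components are far apart, consistent with the exactness property mentioned above.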
A novel approach to multivariate redundancy and synergy
Consider a situation in which a set of "source" random variables $X_1, \ldots, X_n$
have information about some "target" random variable $Y$.
For example, in neuroscience $Y$ might represent the state of an external
stimulus and $X_1, \ldots, X_n$ the activity of $n$ different brain regions.
Recent work in information theory has considered how to decompose the
information that the sources provide about the target
into separate terms such as (1) the "redundant information" that is shared
among all of the sources, (2) the "unique information" that is provided only by a
single source, (3) the "synergistic information" that is provided by all
sources only when considered jointly, and (4) the "union information" that is
provided by at least one source. We propose a novel framework deriving such a
decomposition that can be applied to any number of sources. Our measures are
motivated in three distinct ways: via a formal analogy to intersection and
union operators in set theory, via a decision-theoretic operationalization
based on Blackwell's theorem, and via an axiomatic derivation. A key aspect of
our approach is that we relax the assumption that measures of redundancy and
union information should be related by the inclusion-exclusion principle. We
discuss relations to previous proposals as well as possible generalizations.
Caveats for information bottleneck in deterministic scenarios
Information bottleneck (IB) is a method for extracting information from one
random variable $X$ that is relevant for predicting another random variable
$Y$. To do so, IB identifies an intermediate "bottleneck" variable $T$ that has
low mutual information $I(X;T)$ and high mutual information $I(Y;T)$. The "IB
curve" characterizes the set of bottleneck variables that achieve maximal
$I(Y;T)$ for a given $I(X;T)$, and is typically explored by maximizing the "IB
Lagrangian", $\mathcal{L}_{\mathrm{IB}} = I(Y;T) - \beta I(X;T)$. In some cases, $Y$ is a deterministic
function of $X$, including many classification problems in supervised learning
where the output class $Y$ is a deterministic function of the input $X$. We
demonstrate three caveats when using IB in any situation where $Y$ is a
deterministic function of $X$: (1) the IB curve cannot be recovered by
maximizing the IB Lagrangian for different values of $\beta$; (2) there are
"uninteresting" trivial solutions at all points of the IB curve; and (3) for
multi-layer classifiers that achieve low prediction error, different layers
cannot exhibit a strict trade-off between compression and prediction, contrary
to a recent proposal. We also show that when $Y$ is a small perturbation away
from being a deterministic function of $X$, these three caveats arise in an
approximate way. To address problem (1), we propose a functional that, unlike
the IB Lagrangian, can recover the IB curve in all cases. We demonstrate the
three caveats on the MNIST dataset.
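One way to see caveat (1), hedging on details not spelled out in the abstract: when $Y = f(X)$, data processing gives $I(Y;T) \le I(X;T)$ and $I(Y;T) \le H(Y)$, and both constraints can be saturated by suitable bottlenecks, so the IB curve is the piecewise-linear frontier

\[
  I(Y;T) \;=\; \min\{\, I(X;T),\; H(Y) \,\}.
\]

For any fixed $\beta \in (0,1)$, the objective $I(Y;T) - \beta I(X;T)$ is maximized on this frontier only at the corner point $I(X;T) = H(Y)$, so sweeping $\beta$ never traces out the linear segments of the curve.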
Semantic information, autonomous agency, and nonequilibrium statistical physics
Shannon information theory provides various measures of so-called "syntactic
information", which reflect the amount of statistical correlation between
systems. In contrast, the concept of "semantic information" refers to those
correlations which carry significance or "meaning" for a given system. Semantic
information plays an important role in many fields, including biology,
cognitive science, and philosophy, and there has been a long-standing interest
in formulating a broadly applicable and formal theory of semantic information.
In this paper we introduce such a theory. We define semantic information as the
syntactic information that a physical system has about its environment which is
causally necessary for the system to maintain its own existence. "Causal
necessity" is defined in terms of counter-factual interventions which scramble
correlations between the system and its environment, while "maintaining
existence" is defined in terms of the system's ability to keep itself in a low
entropy state. We also use recent results in nonequilibrium statistical physics
to analyze semantic information from a thermodynamic point of view. Our
framework is grounded in the intrinsic dynamics of a system coupled to an
environment, and is applicable to any physical system, living or otherwise. It
leads to formal definitions of several concepts that have been intuitively
understood to be related to semantic information, including "value of
information", "semantic content", and "agency"
Nonlinear Information Bottleneck
Information bottleneck (IB) is a technique for extracting information in one
random variable $X$ that is relevant for predicting another random variable
$Y$. IB works by encoding $X$ in a compressed "bottleneck" random variable $M$
from which $Y$ can be accurately decoded. However, finding the optimal
bottleneck variable involves a difficult optimization problem, which until
recently has been considered for only two limited cases: discrete $X$ and $Y$
with small state spaces, and continuous $X$ and $Y$ with a Gaussian joint
distribution (in which case optimal encoding and decoding maps are linear). We
propose a method for performing IB on arbitrarily-distributed discrete and/or
continuous $X$ and $Y$, while allowing for nonlinear encoding and decoding
maps. Our approach relies on a novel non-parametric upper bound for mutual
information. We describe how to implement our method using neural networks. We
then show that it achieves better performance than the recently-proposed
"variational IB" method on several real-world datasets
Modularity and the spread of perturbations in complex dynamical systems
We propose a method to decompose dynamical systems based on the idea that
modules constrain the spread of perturbations. We find partitions of system
variables that maximize 'perturbation modularity', defined as the
autocovariance of coarse-grained perturbed trajectories. The measure
effectively separates the fast intramodular from the slow intermodular dynamics
of perturbation spreading (in this respect, it is a generalization of the
'Markov stability' method of network community detection). Our approach
captures variation of modular organization across different system states, time
scales, and in response to different kinds of perturbations: aspects of
modularity which are all relevant to real-world dynamical systems. It offers a
principled alternative to detecting communities in networks of statistical
dependencies between system variables (e.g., 'relevance networks' or
'functional networks'). Using coupled logistic maps, we demonstrate that the
method uncovers hierarchical modular organization planted in a system's
coupling matrix. Additionally, in homogeneously-coupled map lattices, it
identifies the presence of self-organized modularity that depends on the
initial state, dynamical parameters, and type of perturbations. Our approach
offers a powerful tool for exploring the modular organization of complex
dynamical systems.
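For flavor only, here is a small numpy sketch of the kind of quantity involved, under the assumption that a candidate partition is scored by the autocovariance of module-averaged perturbation magnitudes along a perturbed/unperturbed trajectory pair of coupled logistic maps; this is an illustrative reading, not the paper's exact definition of perturbation modularity.

```python
import numpy as np

def coupled_logistic_step(x, coupling, r=3.9):
    """One step of a diffusively coupled logistic map; rows of `coupling` sum to 1."""
    return coupling @ (r * x * (1.0 - x))

def perturbation_score(x0, coupling, partition, eps=1e-6, lag=5, steps=200, seed=0):
    """Illustrative score: summed lag-`lag` autocovariance of module-averaged
    perturbation magnitudes along a perturbed vs. unperturbed trajectory pair."""
    rng = np.random.default_rng(seed)
    x, xp = x0.copy(), np.clip(x0 + eps * rng.standard_normal(x0.shape), 0.0, 1.0)
    modules = sorted(set(partition))
    idx = {m: [i for i, p in enumerate(partition) if p == m] for m in modules}
    traj = []
    for _ in range(steps):
        x, xp = coupled_logistic_step(x, coupling), coupled_logistic_step(xp, coupling)
        diff = np.abs(xp - x)
        traj.append([diff[idx[m]].mean() for m in modules])   # coarse-grain by module
    traj = np.asarray(traj)
    traj -= traj.mean(axis=0)
    return sum(np.mean(traj[:-lag, k] * traj[lag:, k]) for k in range(traj.shape[1]))

# Example: 8 logistic maps, two planted modules with strong intra-module coupling.
n = 8
C = np.full((n, n), 0.01)
C[:4, :4] += 0.2
C[4:, 4:] += 0.2
np.fill_diagonal(C, 0.0)
C = C / C.sum(axis=1, keepdims=True) * 0.3 + np.eye(n) * 0.7   # row-stochastic coupling
score = perturbation_score(np.random.default_rng(1).uniform(0.2, 0.8, n), C,
                           partition=[0, 0, 0, 0, 1, 1, 1, 1])
```

A search over partitions (for example, greedy agglomeration) maximizing such a score would then play the role of the modularity optimization described above.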
Thermodynamics of computing with circuits
Digital computers implement computations using circuits, as do many naturally
occurring systems (e.g., gene regulatory networks). The topology of any such
circuit restricts which variables may be physically coupled during the
operation of a circuit. We investigate how such restrictions on the physical
coupling affect the thermodynamic costs of running the circuit. To do this we
first calculate the minimal additional entropy production that arises when we
run a given gate in a circuit. We then build on this calculation to analyze
how the thermodynamic costs of implementing a computation with a full circuit,
comprising multiple connected gates, depend on the topology of that circuit.
This analysis provides a rich new set of optimization problems that must be
addressed by any designer of a circuit, if they wish to minimize thermodynamic
costs.
Comment: 26 pages (6 of appendices), 5 figures.